Processor-Directed Cache Coherence Mechanism – A Performance Study

Authors

  • H. Sarojadevi
  • S. K. Nandy
Abstract

Cache-coherent multiprocessor architectures are widely used in recent multi-core systems, embedded systems, and massively parallel processors. With the ever-increasing performance gap between processor and memory, a cache-coherent multiprocessor requires an optimal cache coherence mechanism. The conventional directory-based cache coherence scheme used in large-scale multiprocessors suffers from considerable overhead. To overcome this problem, we have developed and evaluated a compiler-assisted, processor-directed cache coherence mechanism. The approach is based on auto-invalidation, uses a hardware buffer termed the Coherence Buffer (CB), and needs no directory. In this paper, the CB method is compared with a self-invalidation based directory approach that employs a last touch predictor (LTP). Detailed architectural simulations of Distributed Shared Memory configurations with superscalar processors show that an 8-entry, 4-way associative CB performs better than both the LTP-based self-invalidation method and a full-map 3-hop directory for five of the SPLASH-2 benchmarks under the release consistency memory model. Given its performance, cost, complexity, and scalability advantages, the CB approach is found to be promising for emerging applications in large-scale multiprocessors, multi-core systems, and transaction processing systems.

Keywords: Cache coherence, Distributed shared memory multiprocessor system, Self-invalidation, Last touch predictor, Release consistency

Similar resources

Multiprocessor Cache Coherence : The

The performance of large-scale shared-memory multiprocessors can be greatly improved if they can cache remote shared data in the private caches of the processors. However, maintaining cache coherence for such systems remains a challenge. Although hardware directory schemes give good performance, they might be too complicated and expensive for large-scale multiprocessors. This tutorial article p...

for Compiler-Directed Cache Coherence

In recent years, rapid advances in high-performance microprocessor design technology have made it possible to build large-scale multiprocessors with a theoretically very high peak computational performance. Unfortunately, there is no corresponding improvement in memory speed and communication bandwidth. In fact, the gap between processor speed and memory speed will become even wider if the pre...

Synchronization coherence: A transparent hardware mechanism for cache coherence and fine-grained synchronization

The quest to improve performance forces designers to explore finer-grained multiprocessor machines. Ever increasing chip densities based on CMOS improvements fuel research in highly parallel chip multiprocessors with 100s of processing elements. With such increasing levels of parallelism, synchronization is set to become a major performance bottleneck and efficient support for synchronization a...

Case Study: On-Demand Coherent Cache for Avionic Applications

In hard real-time systems, such as avionics, there is demand for high performance. A way to meet performance demands is by parallel computation on multicore systems. A main contributor for the application performance on multicore systems is size and type of cache memory. For multicore hard real-time systems, the usage of cache memory is problematic. Time-critical applications need reasonable wo...

Hybrid Shared-aware Cache Coherence Transition Strategy

Chip-multiprocessors have played a significant role in real parallel computer architecture design. To integrate tens of cores into a chip, designs tend toward physically distributed last-level caches. This naturally results in a Non-Uniform Cache Access design, where on-chip access latencies depend on the physical distances between requesting cores and home cores where the data is cach...

Publication date: 2011